Optimization of Synthetic Proteins: Identification of Interpositional Dependencies Indicating Structurally and/or Functionally Linked Residues
Abstract
Protein alignments are commonly used to evaluate the similarity of protein residues, and the derived consensus sequence is used to identify functional units (e.g., domains). Traditional consensus-building models fail to account for interpositional dependencies: functionally required covariation of residues that tend to appear together throughout evolution and across the phylogenetic tree. These relationships can reveal important clues about protein folding, thermostability, and the formation of functional sites, which in turn can inform the engineering of synthetic proteins. Unfortunately, these relationships essentially form sub-motifs that cannot be predicted by simple "majority rule" or even HMM-based consensus models, and the result can be a biologically invalid "consensus" that is not only never seen in nature but is also less viable than any extant protein. We have developed a visual analytics tool, StickWRLD, which creates an interactive 3D representation of a protein alignment and clearly displays covarying residues. The user can pan and zoom, as well as dynamically change the statistical threshold underlying the identification of covariants. StickWRLD has previously been used successfully to identify functionally required covarying residues in proteins such as adenylate kinase and in DNA sequences such as endonuclease target sites.
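The failure mode described in the abstract can be illustrated with a small sketch. The toy alignment, the function names, and the statistic (observed joint frequency minus the frequency expected if the two columns were independent) are all assumptions chosen for illustration; this is not StickWRLD's actual implementation or its statistical test.

```python
from collections import Counter
from itertools import combinations


def majority_consensus(alignment):
    """Column-wise most frequent residue (simple "majority rule")."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*alignment))


def covarying_pairs(alignment, threshold=0.2):
    """Find residue pairs that co-occur more often than independence predicts.

    Returns tuples (col_i, residue_a, col_j, residue_b, residual), where
    residual = observed joint frequency - expected joint frequency.
    The residual and the default threshold are illustrative assumptions.
    """
    n = len(alignment)
    columns = list(zip(*alignment))
    hits = []
    for i, j in combinations(range(len(columns)), 2):
        freq_i = Counter(columns[i])
        freq_j = Counter(columns[j])
        joint = Counter(zip(columns[i], columns[j]))
        for (a, b), count in joint.items():
            observed = count / n
            expected = (freq_i[a] / n) * (freq_j[b] / n)
            residual = observed - expected
            if residual > threshold:
                hits.append((i, a, j, b, residual))
    return hits


# Hypothetical toy alignment: columns 0 and 2 covary, column 1 is conserved.
alignment = ["DGK", "DGK", "DGK", "EGR", "EGR", "NGR", "NGR"]

consensus = majority_consensus(alignment)
print("consensus:", consensus, "| seen in alignment:", consensus in alignment)

for i, a, j, b, residual in covarying_pairs(alignment, threshold=0.2):
    print(f"column {i} {a} <-> column {j} {b}: residual {residual:.3f}")
```

In this toy case the majority-rule consensus combines the most common residue from each column into a sequence that no member of the alignment carries, while the pairwise residual flags the linked residues in columns 0 and 2. Lowering `threshold` reports weaker dependencies, loosely mirroring the adjustable threshold mentioned in the abstract.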
Similar resources
MAVL/StickWRLD: analyzing structural constraints using interpositional dependencies in biomolecular sequence alignments
The increasing availability of structurally aligned protein families has made it possible to use statistical methods to discover regions of interpositional dependencies of residue identity. Such dependencies amongst residues often have structural or functional implications, and their discovery can supply valuable constraints that assist in the refinement of measured, or predicted molecular stru...
Identification of structurally conserved residues of proteins in absence of structural homologs using neural network ensemble
MOTIVATION: So far, various bioinformatics and machine learning techniques have been applied to the identification of sequence and functionally conserved residues in proteins. Although a few computational methods are available for the prediction of structurally conserved residues from protein structure, almost all methods require homologous structural information and structure-based alignments, which still prov...
ConSeq: the identification of functionally and structurally important residues in protein sequences
MOTIVATION: ConSeq is a web server for the identification of biologically important residues in protein sequences. Functionally important residues that take part, e.g. in ligand binding and protein-protein interactions, are often evolutionarily conserved and are most likely to be solvent-accessible, whereas conserved residues within the protein core most probably have an important structural rol...
-
The homeobox genes are known to play a crucial role in controlling the development of multicellular organisms. The majority of these genes have been determined to express regulatory proteins. These trans-acting factors regulate the expression of proteins that are necessary during the developmental processes throughout the body. TGIFLX/Y is a homeobox gene and it cont...
Optimization of Enzymatic Synthesis of Ampicillin Using Cross-Linked Aggregates of Penicillin G Acylase
Penicillin G acylase from E. coli TA1 was immobilized by Cross-Linked Enzyme Aggregates (CLEA), a new method for immobilization. This biocatalyst and commercial immobilized penicillin G acylase (PGA-450) were used to study the effect of pH, temperature and substrate concentration on the synthesis of ampicillin from phenyl glycine methyl ester (PGME) and 6-aminopenicillanic acid (6-APA). Compare...